Intelligibility Rating with Automatic Speech Recognition, Prosodic, and Cepstral Evaluation

Authors

Tino Haderlein

Cornelia Moers

Bernd Möbius

Frank Rosanowski

Elmar Nöth

Abstract

For voice rehabilitation, speech intelligibility is an important criterion. Automatic evaluation of intelligibility has been shown to be successful when automatic speech recognition is combined with prosodic analysis. In this paper, this method is extended by measures based on the Cepstral Peak Prominence (CPP). Seventy-three hoarse patients (48.3 ± 16.8 years) uttered the vowel /e/ and read the German version of the text “The North Wind and the Sun”. Their intelligibility was evaluated perceptually by 5 speech therapists and physicians on a 5-point scale. Support Vector Regression (SVR) revealed a feature set with a human-machine correlation of r = 0.85, consisting of the word accuracy, the smoothed CPP computed from a speech section, and three prosodic features (normalized energy of word-pause-word intervals, F0 value at voice offset in a word, and standard deviation of jitter). The average human-human correlation was r = 0.82. Hence, the automatic method can be a meaningful objective support for perceptual analysis.
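The abstract reports agreement between automatic scores and perceptual ratings as a correlation coefficient and uses word accuracy as one feature. The following is a minimal sketch of how these two quantities are commonly computed. It is not the authors' implementation: the function names are hypothetical, the standard definition of word accuracy (reference words minus substitutions, deletions, and insertions, divided by reference words) is assumed, and Pearson's r is assumed as the correlation measure. The SVR step itself is not shown.

```python
from math import sqrt
from statistics import mean


def word_accuracy(n_ref, substitutions, deletions, insertions):
    """Word accuracy in percent, assuming the usual definition
    (N - S - D - I) / N * 100, where N is the number of reference words."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - substitutions - deletions - insertions) / n_ref


def average_ratings(ratings_per_rater):
    """Average the scores of several raters per speaker.

    ratings_per_rater: one list per rater, each holding one score per speaker
    (for example, scores on a 5-point scale)."""
    return [mean(scores) for scores in zip(*ratings_per_rater)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Synthetic illustration only: these numbers are NOT data from the paper.
    human = average_ratings([
        [1, 2, 3, 4, 5],  # hypothetical rater A
        [2, 2, 3, 5, 5],  # hypothetical rater B
    ])
    machine = [1.4, 2.1, 2.8, 4.2, 4.9]  # hypothetical automatic scores
    print("human-machine r:", pearson_r(human, machine))
```

In such a setup, the automatic score for each speaker would come from a regression model trained on the selected features, and the human reference would be the mean rating across raters.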

Related articles

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Automatic Rating of Hoarseness by Text-based Cepstral and Prosodic Evaluation

The standard for the analysis of distorted voices is perceptual rating of read-out texts or spontaneous speech. Automatic voice evaluation, however, is usually done on stable sections of sustained vowels. In this paper, text-based and established vowel-based analysis are compared with respect to their ability to measure hoarseness and its subclasses. 73 hoarse patients (48.3 ± 16.8 years) uttere...

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

In recent years, automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification

As automatic speech processing has matured, research attention has expanded to paralinguistic speech problems that aim to detect beyond-the-words information. This paper focuses on the identification of seven speaker trait categories from the Interspeech Speaker Trait Challenge: likeability, intelligibility, openness, conscientiousness, extraversion, agreeableness, and neuroticism. Our approach...

Evaluation of Tracheoesophageal Substitute Voices Using Prosodic Features

Tracheoesophageal (TE) speech is one way to restore the ability to speak after laryngectomy, i.e. after the removal of the larynx. TE speech often shows low audibility and intelligibility, which makes it a challenge for the patients to communicate. In speech rehabilitation the patient’s voice quality has to be evaluated. As no objective classification means exists until now and an automati...

Publication date: 2011
